Applications of Weighted Automata in Natural Language Processing
Authors
Abstract
Linguistics and automata theory were at one time tightly knit. Very early on, finite-state processes were used by Markov [35, 27] to predict sequences of vowels and consonants in novels by Pushkin. Shannon [48] extended this idea to predict letter sequences of English words using Markov processes. While many theorems about finite-state acceptors (FSAs) and finite-state transducers (FSTs) were proven in the 1950s, Chomsky argued that such devices were too simple to adequately describe natural language [6]. Chomsky employed context-free grammars (CFGs) and then introduced the more powerful transformational grammars (TG), loosely defined in [7]. In attempting to formalize TG, automata theorists like Rounds [46] and Thatcher [52] introduced the theory of tree transducers. Computational linguistics also got going in earnest, with Woods’ use of augmented transition networks (ATNs) for automatic natural language parsing. In the final paragraph of his 1973 tree automata survey [53], Thatcher wrote:
Similar resources
Tiburon: A Weighted Tree Automata Toolkit
The availability of weighted finite-state string automata toolkits made possible great advances in natural language processing. However, recent advances in syntax-based NLP model design are unsuitable for these toolkits. To combat this problem, we introduce a weighted finite-state tree automata toolkit, which incorporates recent developments in weighted tree automata theory and is useful for na...
NLP Applications Based on Weighted Multi-Tape Automata
This article describes two practical applications of weighted multi-tape automata (WMTAs) in Natural Language Processing that demonstrate the augmented descriptive power of WMTAs compared to weighted 1-tape and 2-tape automata. The two examples concern the preservation of intermediate results in transduction cascades and the search for similar words in two languages. As a basis for these appli...
Weighted Automata in Text and Speech Processing
Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications and especially in speech processing, it is necessary to consider more general machines in which arcs are assigned weights or costs. We briefly describe some of the main theoretical and algorithmic aspects of these machines. In particular, we describe an efficient composition alg...
Weighted Automata in Text
Processing. Mehryar Mohri, Fernando Pereira and Michael Riley, AT&T Research, 600 Mountain Avenue, Murray Hill, NJ 07974. Abstract. Finite-state automata are a very effective tool in natural language processing. However, in a variety of applications and especially in speech processing, it is necessary to consider more general machines in which arcs are assigned ...
Minimizing Deterministic Weighted Tree Automata
The problem of efficiently minimizing deterministic weighted tree automata (wta) is investigated. Such automata have found promising applications as language models in Natural Language Processing. A polynomial-time algorithm is presented that given a deterministic wta over a commutative semifield, of which all operations including the computation of the inverses are polynomial, constructs an eq...
Parsing Algorithms based on Tree Automata
We investigate several algorithms related to the parsing problem for weighted automata, under the assumption that the input is a string rather than a tree. This assumption is motivated by several natural language processing applications. We provide algorithms for the computation of parse-forests, best tree probability, inside probability (called partition function), and prefix probability. Our ...